A review of weighting schemes for bag of visual words image retrieval
Authors
Abstract
Current studies on content-based image retrieval mainly rely on bags of visual words. This image description model makes it possible to perform image retrieval in the same way as text retrieval: documents are described as vectors of (visual) word frequencies, and documents are matched by computing a distance or similarity measure between the vectors. Instead of raw frequencies, documents can also be described as vectors of word weights, each weight corresponding to the importance of the word in the document. Although the problem of automatically determining such weights, and therefore which words describe documents well, has been widely studied for text retrieval, there is very little literature applying this idea to image retrieval. In this report, we explore how standard weighting schemes and distances from text retrieval can help improve the performance of image retrieval systems. We show that no distance or weighting scheme improves performance on every dataset, but that choosing weights or a distance consistent with the properties of a given dataset can improve performance by up to 10%. However, we also show that for very varied and general datasets, the performance gain is not significant.
Key-words: Weighting schemes, content-based image retrieval, bag of visual words, information retrieval, vector space model, probabilistic model, Minkowski distance, divergence from randomness, BM25, TF*IDF
Pierre Tirilly, Vincent Claveau, Patrick Gros
* Texmex Project, CNRS; ** Texmex Project, CNRS; *** Texmex Project, INRIA Rennes
IRISA – Campus de Beaulieu – 35042 Rennes Cedex – France – +33 2 99 84 71 00 – www.irisa.fr
inria-00380706, version 1 - 4 May 2009
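To make the idea in the abstract concrete, here is a minimal Python sketch of weighting bag-of-visual-words histograms and comparing them with a Minkowski distance. It is an illustration under stated assumptions, not the report's implementation: the TF*IDF variant used here (length-normalised term frequency times log(N/df)) is one common textbook form, and the function names and toy histograms are hypothetical.

```python
import math
from typing import List, Sequence


def idf(histograms: Sequence[Sequence[int]]) -> List[float]:
    """Inverse document frequency of each visual word: log(N / df)."""
    n_docs = len(histograms)
    vocab_size = len(histograms[0])
    weights = []
    for w in range(vocab_size):
        # df = number of images in which visual word w occurs at least once
        df = sum(1 for h in histograms if h[w] > 0)
        # Assumption: a word that occurs in no image gets weight 0.
        weights.append(math.log(n_docs / df) if df else 0.0)
    return weights


def tf_idf(histogram: Sequence[int], idf_weights: Sequence[float]) -> List[float]:
    """Weight a raw visual-word histogram: normalised frequency times IDF."""
    total = sum(histogram)
    if total == 0:
        return [0.0] * len(histogram)
    return [(count / total) * w for count, w in zip(histogram, idf_weights)]


def minkowski(u: Sequence[float], v: Sequence[float], p: float = 2.0) -> float:
    """Minkowski distance of order p (p=1: Manhattan, p=2: Euclidean)."""
    return sum(abs(a - b) ** p for a, b in zip(u, v)) ** (1.0 / p)


def rank(query: Sequence[int], collection: Sequence[Sequence[int]],
         p: float = 2.0) -> List[int]:
    """Return indices of collection images, closest to the query first."""
    idf_weights = idf(collection)
    q = tf_idf(query, idf_weights)
    scored = [(minkowski(q, tf_idf(h, idf_weights), p), i)
              for i, h in enumerate(collection)]
    return [i for _, i in sorted(scored)]


# Hypothetical toy collection: 3 images, vocabulary of 4 visual words.
collection = [[4, 0, 1, 0], [3, 1, 0, 0], [0, 0, 2, 5]]
ranking = rank([4, 0, 1, 0], collection, p=1.0)
```

Swapping the weighting function or the order `p` changes the ranking, which is the kind of choice the abstract reports as dataset-dependent.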
Similar resources
A Novel Method for Content Base Image Retrieval Using Combination of Local and Global Features
Content-based image retrieval (CBIR) has been an active research topic in the last decade. In this paper, we propose an image retrieval method using global and local features. First, for local feature extraction, the SURF algorithm produces a set of interest points for each image and a set of 64-dimensional descriptors for each interest point; then, to use the Bag of Visual Words model, a cluster...
Visual word proximity and linguistics for semantic video indexing and near-duplicate retrieval
Bag-of-visual-words (BoW) has recently become a popular representation to describe video and image content. Most existing approaches, nevertheless, neglect inter-word relatedness and measure similarity by bin-to-bin comparison of visual words in histograms. In this paper, we explore the linguistic and onto...
Spatial Weighting for Bag-of-Visual-Words and Its Application in Content-Based Image Retrieval
It is a challenging and important task to retrieve images from a large and highly varied image data set based on their visual contents. Problems like how to fill the semantic gap between image features and the user have attracted a lot of attention from the research community. Recently, the 'bag of visual words' approach exhibits very good performance in content-based image retrieval (CBIR). Ho...
From Text to Images: Weighting Schemes for Image Retrieval
Bags of visual words have been the most studied image description technique in recent years. This representation of images has raised new possibilities as well as new research issues. In particular, it is important to automatically determine which visual words are the most relevant to describe the images, and which ones should be ignored. This issue is a classical problem of textual information retriev...